[57] ABSTRACT

Conditional replenishment device in a video encoder having 
a motion estimator (22) providing a predicted block for each 
predefined block based upon estimating the motion between 
the predefined block of the current image and the corre- 
sponding block in the previous image, transformer (16) for 
transforming a prediction error resulting from the difference 
between the predicted block and the predefined block into 
the frequency domain, and quantizer (20) for quantizing the 
coefficients of the prediction error and providing the quan- 
tized coefficients to a video multiplex coding unit (30). Such 
a conditional replenishment device includes a change detec- 
tor for producing a segmentation map in which each pixel of 
the predefined block is marked as moving or stationary, a 
motion determinator for determining whether there is a 
group of moving pixels the number of which exceeds a 
predetermined threshold defining a moving object in the 
image, and a coder for encoding only blocks containing the 
group of moving pixels. 

14 Claims, 2 Drawing Sheets

CONDITIONAL REPLENISHMENT DEVICE FOR A VIDEO ENCODER

TECHNICAL FIELD

The present invention relates to the video encoding standard H.263 developed by the International Telecommunication Union (ITU) for very low bit-rate multimedia telecommunication and particularly to a conditional replenishment device for a video encoder.

BACKGROUND

The H.263 standard developed by the ITU (International Telecommunication Union) is a part of its H.324/H.323 recommendations for very low bit-rate multimedia telecommunication. The H.263 coding scheme, which is described in "Video Coding for Very Low bit-rate Communication", Draft ITU-Recommendation H.263, May 1996, is based on earlier schemes used in the H.261 and MPEG-1/2 standards, and uses the Hybrid-DPCM concept comprising a motion estimation/compensation mechanism, transform coding and quantization. Each image is divided into blocks of size 16x16 pixels (called macroblocks) and the macroblock in the current picture is predicted from the previous picture using motion estimation techniques. After the prediction, the macroblock is divided into four blocks of size 8x8 pixels. The prediction error is then transformed using the Discrete Cosine Transform (DCT) and the resulting coefficients are quantized and stored in the bitstream along with the motion parameters and other side information. The H.263 standard contains several improvements compared to earlier standards which allow a substantial reduction in the bit-rate while maintaining the same image quality. These improvements make it most suitable for very low bit-rate communication (but do not exclude it from being used in high bit-rate compression as well).

The H.263 coder supports several image sizes and does not have limitations on the bit-rate. These features allow the usage of H.263 in a wide range of bandwidths such as the ones available in the Internet, ISDN, LAN, etc.

The H.263 standard includes the ability to indicate a block as "not coded", but the decision on whether a block is to be coded or not is not a part of the standard.

Real-life images very often contain information with a random nature (texture). When image sequences contain such static textures, the human observer rarely sees any difference in those textured parts when the video runs. If the image differences are examined, however, it is seen that large differences do exist. This is mainly due to the impact of imperfect sampling processes, microscopic camera movements, etc. The H.263 coder (as other standard coders like H.261 or MPEG-1/2) is based on prediction and coding of the difference between predicted and original images. Textured static areas may therefore cause the encoder to spend many bits on these differences although no significant information should be coded there.

OBJECTS OF THE INVENTION

Therefore, the main object of the invention is to provide an improvement of the video coding of the H.263 standard type enabling significant improvements in coding efficiency while maintaining subjective image quality.

Another object of the invention is to provide a replenishment mechanism acting as a pre-processor in the video coding and determining which parts of the image are to be coded.

Another object of the invention is to provide a conditional image replenishment device incorporated in an H.263 encoder and based on a sophisticated change detector to segment the image into changed and unchanged blocks and then using this segmentation for classifying each block as coded or not-coded.

BRIEF SUMMARY OF INVENTION

Accordingly, the conditional replenishment device of the invention is used in a video encoder comprising motion estimation means providing a predicted block for each predefined block based upon estimating the motion between the predefined block of the current image and the corresponding block in the previous image, transform means for transforming a prediction error resulting from the difference between the predicted block and the predefined block into the frequency domain, and quantizing means for quantizing the coefficients of the prediction error in the frequency domain and providing the quantized coefficients to a video multiplex coding unit, wherein the quantized coefficients are de-quantized and inverse transformed to give back the prediction error which is added to the predicted block whereby the result is provided to the motion estimation means in order to get a new current predicted block. Such a conditional replenishment device comprises change detection means for producing a segmentation map in which each pixel of the predefined block is marked as moving or stationary, motion determination means for determining whether there is a group of moving pixels the number of which exceeds a predetermined threshold defining a moving object in the image, and coding means for encoding only blocks containing the group of moving pixels.

Another aspect of the invention is a process of conditionally replenishing the video image wherein the analyzed block is coded if the number of moving pixels in the analyzed block exceeds a first predetermined threshold, or if the number of moving pixels in a group of blocks adjacent to the analyzed block exceeds a second predetermined threshold, or if there is no texture similarity between the analyzed block of the current image and the corresponding block of the previous image.

The objects, features and other characteristics of the invention will become more apparent from the following detailed description with reference to the accompanying drawings in which:

FIG. 1 is a block-diagram of the H.263 encoder incorporating the conditional replenishment unit according to the invention, and

FIG. 2 is a flow chart illustrating the different steps of the process implemented in the conditional replenishment unit according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

The proposed conditional replenishment mechanism is a separate module which is connected to the H.263 encoder. The schematic description of the H.263 encoder including the conditional replenishment module is shown in FIG. 1 with the additional module and its connections emphasized. The H.263 encoder is based on the hybrid-DPCM scheme which is used in most of the standard video coders today. When the encoding function has to be implemented, Coding Control Unit 10 controls switching circuits 12 and 14 as illustrated in FIG. 1.
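
By way of illustration only, the hybrid-DPCM principle referred to above (prediction, transform coding of the prediction error, quantization, and reconstruction inside the encoder loop) may be sketched in Python as follows. The sketch is not taken from the H.263 standard: the function names, the uniform quantizer step `step`, and the orthonormal 8x8 DCT are assumptions made for the example, and the predicted block is simply taken as given rather than obtained by motion estimation.

```python
import math

N = 8  # transform block size (8x8 pixels, as described in the text)

def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix (row k = frequency, column i = sample)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

C = dct_matrix()
CT = [list(col) for col in zip(*C)]  # transpose of C

def dct2(block):
    """2-D DCT of an 8x8 block."""
    return matmul(matmul(C, block), CT)

def idct2(coef):
    """Inverse 2-D DCT (C is orthonormal, so its inverse is its transpose)."""
    return matmul(matmul(CT, coef), C)

def encode_block(current, predicted, step):
    """Prediction error -> DCT -> uniform quantization (hypothetical quantizer)."""
    error = [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, predicted)]
    return [[round(v / step) for v in row] for row in dct2(error)]

def reconstruct_block(indices, predicted, step):
    """De-quantize, inverse transform, and add the error back to the prediction."""
    error = idct2([[q * step for q in row] for row in indices])
    return [[p + e for p, e in zip(pr, er)] for pr, er in zip(predicted, error)]
```

In this sketch the encoder would call `reconstruct_block` on its own output, exactly as a decoder would, so that the next prediction is formed from the same reconstructed image on both sides.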
The encoder is composed of a transform coder 16 which transforms a prediction error, found by subtracting in subtractor 18 the current macroblock received in input from a predicted corresponding macroblock, into the frequency domain by using the Discrete Cosine Transform (DCT), in which the information is represented in a compact way suitable for compression. The DCT coefficients are then quantized in quantizer 20 with many fewer bits. This quantization, which provides a quantizing index q for transform coefficients, introduces the lossy aspect of the video encoder.

The predicted macroblock used to determine the prediction error is provided by a motion estimation unit 22 which provides motion vectors v pointing to the chosen macroblock in the previous image.

The prediction error and the motion vectors v form the information needed for the reconstruction process in the decoder. Indeed, the prediction of the current macroblock is performed with respect to the previous reconstructed image in a similar way as is done in the decoder to avoid any mismatch. To achieve this, a complete decoder is actually implemented in the encoder loop. All the information sent by the encoder to the decoder is coded using Huffman coding which represents the bits in a compact and efficient way. The reconstruction is performed by taking the quantized and transformed prediction error and performing inverse quantization in inverse quantizer 24 and inverse DCT (IDCT) in inverse Transform Coder 26. Then, the macroblock predicted from the previous reconstructed image is added to the prediction error in adder 28 to form the current reconstructed block provided to motion estimation unit 22. Note that this process is conducted on each macroblock of the image.

The control information from Coding Control Unit 10, the quantizing index q for transform coefficients and the motion vectors v are then provided to the Video Multiplex Coding Unit 30.

The Conditional Replenishment Unit 32 according to the invention is an add-on to the encoder illustrated in FIG. 1 and does not affect the resultant bitstream. Frame delay 34 is used to get the current original frame and delay it by one frame. It thus becomes the previous original frame for the next image analysis and is used as input to Conditional Replenishment Unit 32 together with the current image. Conditional Replenishment Unit 32 comprises a change detector which produces a segmentation map, in which each image pixel is marked as moving or stationary. It is based on a statistical model of the difference signal (between the current and previous original frame) and the moving/stationary segmentation map. The change detection problem is formalized as a statistical Maximum Aposteriori Problem (MAP) and is solved using an iterative minimization method. To adapt the detector to image content, the image is first divided, on a block of size 16x16 pixels basis, into smooth and non-smooth regions. This segmentation of the blocks is based on an Auto Regressive (AR) model of the data. The change detector itself works on a pixel level. More details on the smooth/non-smooth segmentation and the change detector mathematical modeling and structure can be found in "Change detection for image sequence coding", Z. Sivan and D. Malah, Proc. Picture Coding Symposium - PCS 93, Article 14.1, Lausanne, Switzerland, March 1993.

The process implemented in Conditional Replenishment Unit 32 is illustrated in FIG. 2. The output of change detection step 40 is a segmentation map indicating moving and stationary pixels. For each analyzed macroblock, the number of moving pixels is counted. If this number exceeds a predefined threshold, step 42, the macroblock is coded in the usual way at step 44. If the number of moving pixels is below the threshold, another test determines if there is a group of moving pixels (representing a moving object) which are spread across more than one macroblock at step 46. Thus, even though the number of moving pixels for each analyzed macroblock is less than the first threshold, if the total amount of moving pixels for the adjacent macroblocks corresponding to the moving object exceeds a second threshold, the analyzed macroblock is also coded. Adjacent pixels on the two sides of the macroblock boundary are tested to see if there is a continuation of the moving area in the adjacent blocks. If the test of step 46 fails, the macroblock from the current image and the one at the matching position in the previous image are compared at step 48 for texture similarity using the Auto Regressive model parameters found in the smooth/non-smooth segmentation mentioned above. Such a test covers the case of a textured object moving over a textured background, a case which cannot be detected by the change detection due to the random nature of the textured areas.

After a macroblock has been analyzed, either it is coded (step 44) as is defined in the H.263 standard (DCT, quantization and VLC coding), or it is to be copied (step 50). The option to declare the macroblock as "not-coded" is used as defined in the H.263 standard. A macroblock defined like this does not contain any DCT coefficient data or motion vector data.

Since each macroblock is divided into four blocks, which are coded separately, the conditional replenishment module tests each macroblock to see which blocks contain the moving pixels. Only blocks that contain a number of moving pixels above a pre-defined threshold are coded and the rest are copied using the method described previously. This way, more savings of bits can be achieved by avoiding the need to code the entire macroblock.

To avoid possible accumulation of errors in the encoding process (since changes are detected between two successive images), there is an option to build the reference image, used in the change detector with the current image, in a different way. Only blocks which are marked to be coded are copied from the current image to the reference image in the change detector. Blocks which are marked copied are not overwritten. This way, the reference image keeps copied blocks from the past and if an error accumulates it is detected sooner or later and corrected in the encoder. This is an option and is used according to the input data characteristics.

With a Conditional Replenishment unit according to the invention which has been described above, every image is analyzed prior to the encoding process, and only blocks which contain significant and meaningful changes are coded. Other blocks are actually copied using the option in H.263 to define a block as "not-coded". This way, a significant reduction of the bit-rate can be achieved while maintaining the subjective quality of the reconstructed sequence. With this method, the previous image used in the change detector is updated only with blocks which are to be coded. Copied blocks are kept in that image, so accumulated errors are detected eventually. The amount of savings is highly dependent on image content, size and the desired bit-rate. A reduction of as high as 50% can be achieved according to tests performed on standard ITU test sequences while maintaining subjective image quality.

What is claimed is:

1. In a video encoder comprising motion estimation
means (22) providing a predicted block for each predefined
block based upon estimating the motion between said
predefined block of the current image and the corresponding
block in the previous image, transform means (16) for 
transforming a prediction error resulting from the difference 
between said predicted block and said predefined block into 
the frequency domain, and quantizing means (20) for quan- 
tizing the coefficients of the prediction error in the frequency 
domain and providing the quantized coefficients to a video 
multiplex coding unit (30), wherein said quantized coeffi- 
cients are de-quantized (24) and inverse transformed (26) to 
give back said prediction error and add it to said predicted 
block whereby the result is provided to said motion estima- 
tion means in order to get a new current predicted block; a 
conditional replenishment device characterized in that it 
comprises: 

change detection means for producing a segmentation 
map in which each pixel of said predefined block is 
marked as moving or stationary, said segmentation map 
being obtained by using an iterative minimization algo- 
rithm to solve a problem formalized as a statistical 
Maximum Aposteriori Problem; 

motion determination means for determining whether 
there is a group of moving pixels the number of which 
exceeds a predetermined threshold; and, 

coding means for encoding only blocks containing said 
group of moving pixels. 

2. Conditional replenishment device according to claim 1, 
wherein said motion determination means comprises: 

first means for determining whether the number of mov-
ing pixels of each analyzed block exceeds a first
predetermined threshold, 

and second means for determining whether the number of 
moving pixels in a group of blocks adjacent to said 
analyzed block exceeds a second predetermined thresh- 
old when the number of moving pixels in said analyzed 
block does not exceed said first predetermined thresh- 
old. 

3. Conditional replenishment device according to claim 2 
further comprising similarity means for determining the 
texture similarity between said analyzed block of the current 
image and the corresponding block of the previous image in 
response to said first means determining that the number of 
moving pixels in said analyzed block does not exceed said 
first predetermined threshold and to said second means 
determining that the number of moving pixels in a group of
blocks adjacent to said analyzed block does not exceed said
second predetermined threshold; and wherein said coding
means encode also those blocks for which there is no such
similarity.

4. Conditional replenishment device according to claim 3, 
wherein said analyzed block is a macroblock composed of 
16x16 pixels. 

5. Conditional replenishment device according to claim 4, 
wherein said macroblock is divided into four blocks of 8x8 
pixels and said first determining means determines which 
ones of said blocks in the macroblock contain a number of 
pixels which exceeds said first predetermined threshold in 
order to code only these blocks. 

6. Conditional replenishment device according to claim 5, 
wherein said transform means (16) use the Discrete Cosine 
Transform (DCT) for transforming said prediction error into 
the frequency domain. 

7. In a video encoder comprising motion estimation
means (22) providing a predicted block for each predefined 
block based upon estimating the motion between said pre- 
defined block of the current image and the corresponding 
block in the previous image, transform means (16) for
transforming a prediction error resulting from the difference
between said predicted block and said predefined block into 
the frequency domain, and quantizing means (20) for quan- 
tizing the coefficients of the prediction error in the frequency
domain and providing the quantized coefficients to a video
multiplex coding unit (30), wherein said quantized coeffi- 
cients are de-quantized (24) and inverse transformed (26) to 
give back said prediction error and add it to said predicted 
block whereby the result is provided to said motion estima-
tion means in order to get a new current predicted block;
process of conditionally replenishing the video image
wherein: 

the pixels for each block are detected as moving or
stationary using an iterative minimization algorithm to
solve a problem formalized as a statistical Maximum
Aposteriori Problem;
and wherein the analyzed block is coded if:
the number of moving pixels in said analyzed block
exceeds a first predetermined threshold, or if
the number of moving pixels in a group of blocks adja-
cent to said analyzed block exceeds a second prede-
termined threshold.

8. Process of claim 7 further comprising steps of condi- 
tionally replenishing, wherein said analyzed block is a
macroblock of 16x16 pixels.

9. Process of claim 8 further comprising steps of condi- 
tionally replenishing, wherein said macroblock is divided 
into four blocks of 8x8 pixels and wherein it is determined 
which ones of said 8x8 blocks in the macroblock contain a
number of pixels exceeding said first predetermined thresh-
old in order to code only these blocks. 

10. In a video encoder comprising motion estimation
means (22) providing a predicted block for each predefined 
block based upon estimating the motion between said pre-
defined block of the current image and the corresponding
block in the previous image, transform means (16) for 
transforming a prediction error resulting from the difference 
between said predicted block and said predefined block into 
the frequency domain, and quantizing means (20) for quan-
tizing the coefficients of the prediction error in the frequency
domain and providing the quantized coefficients to a video 
multiplex coding unit (30), wherein said quantized coeffi- 
cients are de-quantized (24) and inverse transformed (26) to 
give back said prediction error and add it to said predicted 
block whereby the result is provided to said motion estima-
tion means in order to get a new current predicted block; a
conditional replenishment device comprising:

change detection means for producing a segmentation
map in which each pixel of said predefined block is
marked as moving or stationary;

motion determination means for determining whether 
there is a group of moving pixels the number of which 
exceeds a predetermined threshold defining a moving 
object in the image, 
coding means for encoding only blocks containing said
group of moving pixels; and,

similarity means for determining the texture similarity
between said analyzed block of the current image and
the corresponding block of the previous image in
response to said determination means determining that
the number of moving pixels in said analyzed block
does not exceed said first predetermined threshold; and
wherein said coding means encode also those blocks
for which there is no such similarity.

11. Conditional replenishment device according to claim
10, wherein said analyzed block is a macroblock composed
of 16x16 pixels.

12. Conditional replenishment device according to claim
11, wherein said macroblock is divided into four blocks of
8x8 pixels and said first determining means determines 
which ones of said blocks in the macroblock contain a 
number of pixels which exceeds said first predetermined 
threshold in order to code only these blocks. 

13. Conditional replenishment device according to claim
12, wherein said transform means (16) use the Discrete
Cosine Transform (DCT) for transforming said prediction
error into the frequency domain.

14. Process of claim 7 wherein the analyzed block may be 
further coded if there is no texture similarity between said 
analyzed block of the current image and the corresponding 
block of the previous image. 
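
For illustration only, the decision process recited above (first threshold on the analyzed macroblock, second threshold on a group of adjacent blocks, then texture similarity) and the selection of 8x8 blocks within a macroblock may be sketched in Python as follows. This is a simplified reading and not a definitive implementation: the function names, the 0/1 segmentation map `seg`, the restriction to the four directly adjacent macroblocks, and the boundary-continuation test are assumptions made for the example. The texture comparison based on Auto Regressive model parameters is not specified in enough detail to reproduce, so its outcome is passed in as the hypothetical boolean `textures_similar`.

```python
MB = 16  # macroblock size in pixels

def count_moving(seg, top, left, size):
    """Number of pixels marked moving (1) in a size x size window of the map."""
    return sum(seg[r][c] for r in range(top, top + size)
               for c in range(left, left + size))

def continues_into(seg, top, left, dr, dc):
    """True if a moving pixel on the macroblock edge facing (dr, dc)
    touches a moving pixel just across the macroblock boundary."""
    for i in range(MB):
        if dr:  # neighbour above (dr = -1) or below (dr = +1)
            r_in = top if dr < 0 else top + MB - 1
            r_out, c_in, c_out = r_in + dr, left + i, left + i
        else:   # neighbour to the left (dc = -1) or right (dc = +1)
            c_in = left if dc < 0 else left + MB - 1
            c_out, r_in, r_out = c_in + dc, top + i, top + i
        if seg[r_in][c_in] and seg[r_out][c_out]:
            return True
    return False

def should_code(seg, mb_row, mb_col, t1, t2, textures_similar):
    """Return True if the macroblock is to be coded, False if it is copied."""
    rows, cols = len(seg) // MB, len(seg[0]) // MB
    top, left = mb_row * MB, mb_col * MB
    own = count_moving(seg, top, left, MB)
    if own > t1:                       # first threshold
        return True
    group = own                        # moving object spread over neighbours
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = mb_row + dr, mb_col + dc
        if (0 <= nr < rows and 0 <= nc < cols
                and continues_into(seg, top, left, dr, dc)):
            group += count_moving(seg, nr * MB, nc * MB, MB)
    if group > t2:                     # second threshold
        return True
    return not textures_similar        # texture similarity test

def blocks_to_code(seg, mb_row, mb_col, t_block):
    """(row, col) indices of the 8x8 blocks whose moving-pixel count exceeds t_block."""
    top, left = mb_row * MB, mb_col * MB
    return [(br, bc) for br in (0, 1) for bc in (0, 1)
            if count_moving(seg, top + 8 * br, left + 8 * bc, 8) > t_block]
```

A macroblock for which `should_code` returns False would be signalled as "not-coded", and the 8x8 blocks of a coded macroblock that `blocks_to_code` leaves out would likewise be copied rather than coded.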